IN THE CLAIMS 

1. (Currently Amended) An apparatus, in an integrated circuit (IC) of a data processing
system having at least one host processor and host memory, comprising: 
a chip interconnect; 

a host interface coupled to the chip interconnect for interfacing the IC with the at least 
one host processor external to the IC; 

a memory interface coupled to the chip interconnect for accessing a memory external 
to the IC, the memory interface including a non-coherent interface for 
interfacing the IC with the host memory external to the IC, the memory
interface including a coherent interface for interfacing the IC with a cache
memory external to the IC via the at least one host processor;

a memory controller coupled to the chip interconnect for controlling the host memory 
comprising DRAM memory via the memory interface, the memory controller
to determine whether to access the memory through the coherent interface or
the non-coherent interface;

a scalar processing unit coupled to the chip interconnect, the scalar processing unit 
being capable of executing instructions to perform scalar data processing;

a vector processing unit coupled to the chip interconnect, the vector processing unit 
being capable of executing instructions to perform vector data processing; and

an input and output (I/O) interface coupled to the chip interconnect for interfacing the 
IC with an I/O controller of the data processing system, the I/O controller being 
external to the IC for controlling I/O devices of the data processing system, 
wherein the chip interconnect, the memory controller, the scalar processing 
unit, the vector processing unit, the I/O interface, the host interface, and the 
memory interface are implemented within the IC which is a single chipset 
interfacing the at least one host processor and the host memory with other 
components of the data processing system, including the I/O controller and the
I/O devices. 

2. (Previously Presented) The apparatus of claim 1, further comprising a switch mechanism 
coupled to the chip interconnect and coupled to the scalar processing unit and coupled to the
vector processing unit, the switch mechanism operable to receive multiple media data streams 
from the I/O interface and dispatch the multiple media data streams to the scalar processing 
unit and/or the vector processing unit. 

3. (Currently Amended) The apparatus of claim 1, further comprising:

multiple scalar processing units, the multiple scalar processing units being capable of
executing instructions to perform scalar processing substantially
simultaneously; and

multiple vector processing units, the multiple vector processing units being capable of
executing instructions to perform vector processing substantially 
simultaneously. 

4. (Original) The apparatus of claim 3, further comprising multiple scalar processing 
units of a kind and multiple vector processing units of a kind.

5. (Currently Amended) The apparatus of claim 1, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 
a vector register (VR) file coupled to the vector processing unit; and 
a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of
executing instructions to load and store vector data from and to the VR. 

6. (Original) The apparatus of claim 5, further comprising a memory location coupled to
the chip interconnect, wherein the LSU loads and stores data from and to the memory 
location. 

7. (Original) The apparatus of claim 6, further comprising a direct memory access (DMA) 
engine, the DMA engine transferring the multiple media data between the memory location 
and the host memory. 

8. (Original) The apparatus of claim 5, wherein the LSU is capable of executing 
instructions to load and store various formats of scalar and vector data, wherein the various 
formats comprise 8-bit, 16-bit, and 32-bit formats. 

9. (Previously Presented) The apparatus of claim 1, wherein the switch mechanism
comprises an instruction unit (IUNIT), the IUNIT controlling and dispatching instructions 
substantially simultaneously. 

10. (Original) The apparatus of claim 9, wherein the instructions comprise very long 
instruction word (VLIW) instructions. 

11. (Original) The apparatus of claim 9, wherein the IUNIT further comprises:

a program counter;

a branch unit, wherein the program counter and the branch unit determine the location
to fetch next instructions;

an instruction cache memory, the instruction cache memory comprising instruction
cache tag and data memories for buffering instructions transmitted from the
host memory; and

at least one memory mapped register accessible by the host.

12. (Currently Amended) The apparatus of claim 1, wherein the scalar processing unit
comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;
and

an integer shift unit (ISHU), the ISHU being capable of executing instructions to
perform scalar bit shifting and rotating operations.

13. (Currently Amended) The apparatus of claim 12, wherein the scalar processing unit 
further comprises a floating point unit (FPU), the FPU being capable of executing instructions
to perform high precision scalar data processing. 

14. (Currently Amended) The apparatus of claim 1, wherein the vector processing unit
comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations; and

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up.

15. (Currently Amended) The apparatus of claim 14, wherein the vector processing unit 
further comprises a vector floating point unit (VFPU), the VFPU being capable of
executing instructions to perform high precision vector data processing. 

16. (Original) The apparatus of claim 14, wherein the VLUT comprises a memory location
storing at least one look-up table (LUT). 

17. (Original) The apparatus of claim 16, wherein data of the LUT are transferred from the 
host memory to the memory location through a direct memory access (DMA) operation. 

18. (Original) The apparatus of claim 16, wherein the memory location comprises a static 
random access memory (SRAM). 

19. (Original) The apparatus of claim 1, wherein the scalar and vector processing units are 
capable of performing data processing autonomously and asynchronously to the host 
processor. 

20. (Original) The apparatus of claim 1, wherein the scalar and vector processing units
communicate with the host processor through an interrupt mechanism.

21. (Original) The apparatus of claim 1, wherein the scalar and vector processing units are
accessible by the host processor, through a set of memory mapped addresses. 

22. (Original) The apparatus of claim 1, wherein the IC may be a co-processor to the host,
wherein the IC may be a stand-alone processor coupled to a bus of the data processing 
system, and wherein the chipset may be a core logic chip having a host interface coupled 
to the host processor and memory interface coupled to the host memory. 

23. (Original) The apparatus of claim 5, further comprising a special purpose register (SPR)
file coupled to the chip interconnect. 

24. (Currently Amended) A method, in an integrated circuit (IC) having a chip interconnect,
of a data processing system having at least one host processor and a host memory, the method 
comprising: 

receiving a data stream from an input/output (I/O) interface coupled to the chip
interconnect, the I/O interface capable of being coupled to an I/O controller
external to the IC, the I/O controller controlling I/O devices of the data
processing system external to the IC, the I/O interface coupled with a memory
interface for accessing a memory external to the IC, the memory interface
including a non-coherent interface for interfacing the IC with the host memory,
the memory interface including a coherent interface for interfacing the IC with
a cache memory via the at least one host processor, the memory interface
coupled with a memory controller to determine whether to access the memory
through the coherent interface or the non-coherent interface;

examining data of the data stream to determine whether the data requires scalar data 
processing or vector data processing; 

performing scalar data processing on the data in the IC, if the data requires scalar data 
processing; and 

performing vector data processing on the data in the IC, if the data requires vector data 
processing, wherein receiving the data stream, examining the data, the scalar 
data processing, and the vector data processing are performed within the IC 
which is a single chipset interfacing the at least one host processor and the host 
memory with other components of the data processing system including the I/O 
controller and the I/O devices, wherein the at least one host processor, the host 
memory, the I/O controller, and the I/O devices are external to the IC, and 
wherein the I/O controller and the I/O devices communicate with the host 
processor and the host memory via the IC. 

25. (Original) The method of claim 24, further comprising:

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

26. (Previously Presented) The method of claim 25, further comprising multiple scalar 
processing units performing scalar data processing substantially simultaneously and multiple 
vector processing units performing vector data processing substantially simultaneously. 

27. (Previously Presented) The method of claim 25, further comprising: 

dispatching the data to the scalar processing unit if the data requires scalar data 
processing; and 

dispatching the data to the vector processing unit if the data requires vector data 
processing. 

28. (Original) The method of claim 27, wherein the dispatching is performed by a switch 
mechanism coupled to the chip interconnect, the switch mechanism receiving the data from 
the I/O interface. 

29. (Currently Amended) The method of claim 28, wherein the switch mechanism 
comprises an instruction unit (IUNIT), the IUNIT being capable of decoding the data.

30. (Currently Amended) The method of claim 24, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 
a vector register (VR) file coupled to the vector processing unit; 
a special purpose register (SPR) coupled to the chip interconnect; and
a load and store unit (LSU), the LSU being capable of executing instructions to load
and store scalar data from and to the GPR, and the LSU being capable of
executing instructions to load and store vector data from and to the VR. 

31. (Original) The method of claim 30, further comprising a memory location coupled to the
chip interconnect, wherein the LSU loads and stores data from and to the memory location. 

32. (Original) The method of claim 31, further comprising transferring the data between the
memory location and the host memory, through a direct memory access (DMA) operation. 

33. (Currently Amended) The method of claim 24, wherein the scalar processing unit 
comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to
perform scalar bit shifting and rotating operations; and

a floating point unit (FPU), the FPU being capable of executing instructions to
perform high precision scalar data processing. 

34. (Currently Amended) The method of claim 24, wherein the vector processing unit 
comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

35. (Original) The method of claim 34, wherein the VLUT comprises a memory location 
storing at least one look-up table (LUT). 

36. (Original) The method of claim 35, further comprising transferring data of the LUT from 
the host memory to the memory location, through a direct memory access (DMA) operation. 

37. (Original) The method of claim 24, wherein the scalar data processing and vector data 
processing are performed autonomously and asynchronously to the host processor. 

38. (Original) The method of claim 25, wherein the scalar processing unit and the vector 
processing unit communicate with the host processor through an interrupt mechanism. 

39. (Original) The method of claim 25, wherein the scalar processing unit and the vector 
processing unit are accessible by the host processor through a set of memory mapped
addresses. 

40. (Currently Amended) An apparatus, in an integrated circuit (IC) having a chip 
interconnect, of a data processing system having at least one host processor and a host 
memory, the apparatus comprising: 

means for receiving a data stream from an input/output (I/O) interface coupled to the
chip interconnect, the I/O interface capable of being coupled to an I/O
controller external to the IC, the I/O controller controlling I/O devices of the
data processing system external to the IC, the I/O interface coupled with a
memory interface for accessing a memory external to the IC, the memory
interface including a non-coherent interface for interfacing the IC with the host
memory, the memory interface including a coherent interface for interfacing
the IC with a cache memory via the at least one host processor, the memory
interface coupled with a memory controller to determine whether to access the
memory through the coherent interface or the non-coherent interface;

means for examining the data to determine whether the data requires scalar data 
processing or vector data processing; 

means for performing scalar data processing on the data in the IC, if the data requires 
scalar data processing; and 

means for performing vector data processing on the data in the IC, if the data requires 
vector data processing, wherein receiving the data stream, examining the data, 
the scalar data processing, and the vector data processing are performed within 
the IC which is a single chipset interfacing the at least one host processor and 
the host memory with other components of the data processing system 
including the I/O controller and the I/O devices, wherein the at least one host 
processor, the host memory, the I/O controller, and the I/O devices are external 
to the IC, and wherein the I/O controller and the I/O devices communicate with 
the host processor and the host memory via the IC. 

41. (Original) The apparatus of claim 40, further comprising:

a scalar processing unit coupled to the chip interconnect, the scalar processing unit 
performing scalar data processing on the data; and 

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data. 

42. (Previously Presented) The apparatus of claim 41, further comprising multiple scalar
processing units performing scalar data processing substantially simultaneously and 
multiple vector processing units performing vector data processing substantially 
simultaneously. 

43. (Previously Presented) The apparatus of claim 41, further comprising: 

means for dispatching the data to the scalar processing unit if the data requires scalar 
data processing; and 

means for dispatching the data to the vector processing unit if the data requires vector 
data processing. 

44. (Original) The apparatus of claim 43, wherein the dispatching is performed by a switch 
mechanism coupled to the chip interconnect, the switch mechanism receiving the data 
from the I/O interface. 

45. (Currently Amended) The apparatus of claim 44, wherein the switch mechanism 
comprises an instruction unit (IUNIT), the IUNIT being capable of decoding the data.

46. (Currently Amended) The apparatus of claim 40, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU being capable of executing instructions to load
and store scalar data from and to the GPR, and the LSU being capable of
executing instructions to load and store vector data from and to the VR. 

47. (Original) The apparatus of claim 46, further comprising a memory location coupled to 
the chip interconnect, wherein the LSU loads and stores data from and to the memory 
location. 

48. (Original) The apparatus of claim 47, further comprising means for transferring the data 
between the memory location and the host memory, through a direct memory access (DMA) 
operation. 

49. (Currently Amended) The apparatus of claim 40, wherein the scalar processing unit 
comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to
perform scalar bit shifting and rotating operations; and

a floating point unit (FPU), the FPU being capable of executing instructions to 
perform high precision scalar data processing. 

50. (Currently Amended) The apparatus of claim 40, wherein the vector processing unit 
comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

51. (Original) The apparatus of claim 50, wherein the VLUT comprises a memory location
storing at least one look-up table (LUT). 

52. (Original) The apparatus of claim 51, further comprising means for transferring data of
the LUT from the host memory to the memory location, through a direct memory access 
(DMA) operation. 

53. (Original) The apparatus of claim 40, wherein the scalar data processing and vector data 
processing are performed autonomously and asynchronously to the host processor. 

54. (Original) The apparatus of claim 41, wherein the scalar processing unit and the vector 
processing unit communicate with the host processor through an interrupt mechanism. 

55. (Original) The apparatus of claim 41, wherein the scalar processing unit and the vector 
processing unit are accessible by the host processor through a set of memory mapped
addresses. 

56. (Currently Amended) A machine readable medium having stored thereon executable 
code which causes a machine to perform a method, in an integrated circuit (IC) having a chip 
interconnect, of a data processing system having at least one host processor and a host
memory, the method comprising: 

receiving a data stream from an input/output (I/O) interface coupled to the chip
interconnect, the I/O interface capable of being coupled to an I/O controller
external to the IC, the I/O controller controlling I/O devices of the data
processing system external to the IC, the I/O interface coupled with a memory
interface for accessing a memory external to the IC, the memory interface
including a non-coherent interface for interfacing the IC with the host memory,
the memory interface including a coherent interface for interfacing the IC with
a cache memory via the at least one host processor, the memory interface
coupled with a memory controller to determine whether to access the memory
through the coherent interface or the non-coherent interface;

examining the data to determine whether the data requires scalar data processing or 
vector data processing; 

performing scalar data processing on the data in the IC, if the data requires scalar data 
processing; and 

performing vector data processing on the data in the IC, if the data requires vector data 
processing, wherein receiving the data stream, examining the data, the scalar 
data processing, and the vector data processing are performed within the IC 
which is a single chipset interfacing the at least one host processor and the host 
memory with other components of the data processing system including the I/O 
controller and the I/O devices, wherein the at least one host processor, the host 
memory, the I/O controller, and the I/O devices are external to the IC, and 
wherein the I/O controller and the I/O devices communicate with the host 
processor and the host memory via the IC. 

57. (Original) The machine readable medium of claim 56, wherein the method further
comprises: 

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

58. (Previously Presented) The machine readable medium of claim 57, wherein the method 
further comprises multiple scalar processing units performing scalar data processing 
substantially simultaneously and multiple vector processing units performing vector data 
processing substantially simultaneously. 

59. (Previously Presented) The machine readable medium of claim 57, wherein the method 
further comprises: 

dispatching the data to the scalar processing unit if the data requires scalar data 
processing; and 

dispatching the data to the vector processing unit if the data requires vector data 
processing. 

60. (Original) The machine readable medium of claim 59, wherein the dispatching is 
performed by a switch mechanism coupled to the chip interconnect, the switch mechanism 
receiving the data from the I/O interface. 

61. (Currently Amended) The machine readable medium of claim 60, wherein the switch
mechanism comprises an instruction unit (IUNIT), the IUNIT being capable of decoding the
data. 

62. (Currently Amended) The machine readable medium of claim 56, wherein the method
further comprises: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of
executing instructions to load and store vector data from and to the VR. 

63. (Original) The machine readable medium of claim 62, wherein the method further 
comprises a memory location coupled to the chip interconnect, wherein the LSU loads and 
stores data from and to the memory location. 

64. (Original) The machine readable medium of claim 63, wherein the method further 
comprises transferring the data between the memory location and the host memory, through a 
direct memory access (DMA) operation. 

65. (Currently Amended) The machine readable medium of claim 56, wherein the scalar 
processing unit comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to
perform scalar bit shifting and rotating operations; and

a floating point unit (FPU), the FPU being capable of executing instructions to
perform high precision scalar data processing. 

66. (Currently Amended) The machine readable medium of claim 56, wherein the vector
processing unit comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

67. (Original) The machine readable medium of claim 66, wherein the VLUT comprises a 
memory location storing at least one look-up table (LUT). 

68. (Original) The machine readable medium of claim 67, wherein the method further 
comprises transferring data of the LUT from the host memory to the memory location, 
through a direct memory access (DMA) operation. 

69. (Original) The machine readable medium of claim 56, wherein the scalar data processing 
and vector data processing are performed autonomously and asynchronously to the host 
processor. 

70. (Original) The machine readable medium of claim 57, wherein the scalar processing unit 
and the vector processing unit communicate with the host processor through an interrupt 
mechanism. 

71. (Original) The machine readable medium of claim 57, wherein the scalar processing unit
and the vector processing unit are accessible by the host processor through a set of memory
mapped addresses. 